Category: Evolutionary Programming

Discover Probabilistic Knowledge from Databases Using Evolutionary Computation and Minimum Description Length Principle

Authors

  • Wai Lam
  • Man Leung Wong
  • Kwong Sak Leung
  • Po Shun Ngan
  • Tuen Mun
Abstract

We have developed a new approach (MDLEP) to learning Bayesian network structures based on the Minimum Description Length (MDL) principle and Evolutionary Programming (EP). It integrates an MDL metric founded on information theory and several new genetic operators, including structure-guided operators, a knowledge-guided operator, a freeze operator, and a defrost operator, for the discovery process. In contrast, existing techniques based on genetic algorithms (GA) adopt only classical genetic operators. We conduct a series of experiments to demonstrate the performance of our approach and to compare it with that of the GA approach developed in a recent work. The empirical results illustrate that our approach is superior in terms of both solution quality and computational time on the data sets we have tested. In particular, our approach can scale up and discover extremely good networks from large benchmark data sets of 10,000 cases. Lastly, our MDLEP approach does not need to impose the restriction of having a complete variable ordering as input.
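The abstract does not state the MDL metric itself, so the following is only a minimal sketch of one common simplified form of an MDL score for a discrete Bayesian network structure: a structure cost of (log2 N)/2 bits per free parameter plus the negative log-likelihood of the data under maximum-likelihood estimates. It is not necessarily the exact metric used in MDLEP. The function name `mdl_score`, the `parents` and `arity` arguments, and the toy data are all hypothetical, and the structure passed in is assumed to be acyclic. Lower scores are better.

```python
import math
from collections import Counter

def mdl_score(data, parents, arity):
    """Simplified MDL score (in bits) of a candidate structure; lower is better.

    data    : list of tuples, one integer state per variable
    parents : dict mapping each variable index to a tuple of parent indices
    arity   : list giving the number of states of each variable
    Assumption: this is a generic BIC-like MDL form, not the paper's exact metric.
    """
    n = len(data)
    total = 0.0
    for node, pa in parents.items():
        # Structure cost: number of free parameters for this node,
        # each encoded with (log2 n) / 2 bits.
        num_parent_configs = math.prod(arity[p] for p in pa)
        num_params = (arity[node] - 1) * num_parent_configs
        total += 0.5 * math.log2(n) * num_params

        # Data cost: negative log-likelihood of this node given its parents,
        # using observed relative frequencies as probability estimates.
        joint = Counter((tuple(row[p] for p in pa), row[node]) for row in data)
        marginal = Counter(tuple(row[p] for p in pa) for row in data)
        for (pa_val, _), count in joint.items():
            total -= count * math.log2(count / marginal[pa_val])
    return total

# Toy data in which variable 1 is always a copy of variable 0.
toy_data = [(0, 0), (1, 1)] * 50
toy_arity = [2, 2]

no_edge = mdl_score(toy_data, {0: (), 1: ()}, toy_arity)
with_edge = mdl_score(toy_data, {0: (), 1: (0,)}, toy_arity)
```

On this toy data the structure with the edge 0 → 1 scores lower than the empty structure, because the extra parameter costs far fewer bits than the edge saves in encoding variable 1. An evolutionary search in the spirit of the abstract would use a score of this kind as the fitness of each candidate network.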

Related resources

Evolutionary Computation

This chapter addresses the integration of knowledge discovery in databases (KDD) and evolutionary algorithms (EAs), particularly genetic algorithms and genetic programming. First we provide a brief overview of EAs. Then the remaining text is divided into three parts. Section 2 discusses the use of EAs for KDD. The emphasis is on the use of EAs in attribute selection and in the optimization of p...

Control of Inductive Bias in Supervised Learning using Evolutionary Computation: A Wrapper-Based Approach

In this chapter, I discuss the problem of feature subset selection for supervised inductive learning approaches to knowledge discovery in databases (KDD), and examine this and related problems in the context of controlling inductive bias. I survey several combinatorial search and optimization approaches to this problem, focusing on data-driven validation-based techniques. In particular, I presen...

Using Evolutionary Programming and Minimum Description Length Principle for Data Mining of Bayesian Networks

We have developed a new approach (MDLEP) to learning Bayesian network structures based on the Minimum Description Length (MDL) principle and Evolutionary Programming (EP). It employs an MDL metric, which is founded on information theory, and integrates a knowledge-guided genetic operator for the optimization in the search process. In contrast, existing techniques based on genetic algorithms (GA)...

Maintaining regularity and generalization in data using the minimum description length principle and genetic algorithm: Case of grammatical inference

In this paper, a genetic algorithm with minimum description length (GAWMDL) is proposed for grammatical inference. The primary challenge in identifying a language of infinite cardinality from a finite set of examples is knowing when to generalize and when to specialize from the training data. The minimum description length principle, which has been incorporated to address this issue, is discussed in this paper...

Evolutionary Induction of Sparse Neural Trees

This paper is concerned with the automatic induction of parsimonious neural networks. In contrast to other program induction situations, network induction entails parametric learning as well as structural adaptation. We present a novel representation scheme called neural trees that allows efficient learning of both network architectures and parameters by genetic search. A hybrid evolutionary me...

Publication date: 2006